Effects of domain characteristics on instance-based learning algorithms
نویسندگان
چکیده
This paper presents average-case analyses of instance-based learning algorithms. The algorithms analyzed employ a variant of k-nearest neighbor classi-er (k-NN). Our analysis deals with a monotone m-of-n target concept with irrelevant attributes, and handles three types of noise: relevant attribute noise, irrelevant attribute noise, and class noise. We formally represent the expected classi-cation accuracy of k-NN as a function of domain characteristics including the number of training instances, the number of relevant and irrelevant attributes, the threshold number in the target concept, the probability of each attribute, the noise rate for each type of noise, and k. We also explore the behavioral implications of the analyses by presenting the e ects of domain characteristics on the expected accuracy of k-NN and on the optimal value of k for arti-cial domains. c © 2002 Elsevier Science B.V. All rights reserved.
منابع مشابه
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
متن کاملImage Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing where there exist many low-level representations of image, i.e., SIFT, HOG and so on. But there is a missing link across low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principle component analysis are employed to d...
متن کاملA New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...
متن کاملSample-oriented Domain Adaptation for Image Classification
Image processing is a method to perform some operations on an image, in order to get an enhanced image or to extract some useful information from it. The conventional image processing algorithms cannot perform well in scenarios where the training images (source domain) that are used to learn the model have a different distribution with test images (target domain). Also, many real world applicat...
متن کاملSecond-Order Statistical Texture Representation of Asphalt Pavement Distress Images Based on Local Binary Pattern in Spatial and Wavelet Domain
Assessment of pavement distresses is one of the important parts of pavement management systems to adopt the most effective road maintenance strategy. In the last decade, extensive studies have been done to develop automated systems for pavement distress processing based on machine vision techniques. One of the most important structural components of computer vision is the feature extraction met...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Theor. Comput. Sci.
دوره 298 شماره
صفحات -
تاریخ انتشار 2003